Automatic Acquisition of Possible Contexts for Low-Frequent Words

Author

  • Silvia Necsulescu
Abstract

This work is a PhD project that aims to overcome the problem that data sparsity causes in the acquisition of lexical resources. In a corpus of any size, many words are infrequent and are therefore observed co-occurring with only a small set of words, even though they could co-occur with many others. Our goal is to discover further possible co-occurring words for low-frequent words, relying on other co-occurrences observed in the corpus. Our approach aims to formulate a new similarity measure, based on word usage in language, to license a transfer of co-occurring words from a frequent word to a low-frequent...
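The idea of transferring co-occurrences can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the abstract does not define the proposed similarity measure, so plain cosine similarity over co-occurrence counts stands in for it, and the names `cosine`, `transfer_contexts`, the `threshold` parameter, and the toy counts are hypothetical rather than taken from the paper.

```python
from collections import Counter
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two co-occurrence count vectors.
    A stand-in only; the abstract's own similarity measure is not specified."""
    dot = sum(a[w] * b[w] for w in set(a) & set(b))
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def transfer_contexts(cooc, rare_word, frequent_words, threshold=0.5):
    """Propose unseen context words for rare_word, borrowed from frequent
    words whose observed contexts are similar enough (hypothetical threshold)."""
    rare = cooc[rare_word]
    candidates = Counter()
    for frequent in frequent_words:
        sim = cosine(rare, cooc[frequent])
        if sim >= threshold:
            for context, count in cooc[frequent].items():
                if context not in rare:
                    # Weight each borrowed context by similarity and frequency.
                    candidates[context] += sim * count
    return candidates

# Toy co-occurrence counts, invented for illustration.
cooc = {
    "sip": Counter({"tea": 1, "coffee": 1}),
    "drink": Counter({"tea": 10, "coffee": 8, "water": 12}),
    "drive": Counter({"car": 15, "truck": 5}),
}

proposed = transfer_contexts(cooc, "sip", ["drink", "drive"])
print(sorted(proposed))
```

In this toy setting the rare word "sip" shares contexts with "drink" but none with "drive", so only the unseen context of "drink" is proposed. A real system would need a similarity measure that remains reliable when the rare word has very few observed contexts, which is the difficulty the abstract sets out to address.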

Similar resources

Automatic Acquisition of Two-Level Morphological Rules

We describe and experimentally evaluate a complete method for the automatic acquisition of two-level rules for morphological analyzers/generators. The input to the system consists of sets of source-target word pairs, where the target is an inflected form of the source. There are two phases in the acquisition process: (1) segmentation of the target into morphemes and (2) determination of the optimal two-...

EBL2: An Approach To Automatic Lexical Acquisition

A method for automatic lexical acquisition is outlined. An existing lexicon that, in addition to ordinary lexical entries, contains prototypical entries for various non-exclusive paradigms of open-class words, is extended by inferring new lexical entries from texts containing unknown words. This is done by comparing the constraints placed on the unknown words by the natural language system's...

Automatic Acquisition of Synonyms Using the Web as a Corpus

We present an original algorithm for automatic acquisition of synonyms from text. The algorithm measures the semantic similarity between pairs of words by comparing their local contexts, extracted from the Web by a series of queries against the Google search engine. The results show an 11-point average precision of 63.16%.

Context Feature Selection for Distributional Similarity

Distributional similarity is a widely used concept to capture the semantic relatedness of words in various NLP tasks. However, accurate similarity calculation requires a large number of contexts, which leads to impractically high computational complexity. To alleviate the problem, we have investigated the effectiveness of automatic context selection by applying feature selection methods explore...

Semantic Content Acquisition and Representation (SCAR) 2007

Given a target word wi to be disambiguated, we define a class of local contexts for wi such that the sense of wi is univocally determined. We call such local contexts sense discriminative and represent them with sense discriminative (SD) patterns of lexico-syntactic features. We describe an algorithm for the automatic acquisition of minimal SD patterns based on training data in SemCor. We have ...

Publication date: 2011